
179 research outputs found

Determining the optimal redistribution

Author: Herrmann Julien
Hérault Thomas
Marchal Loris
Robert Yves
Publication venue: HAL CCSD
Publication date: 18/03/2014

The classical redistribution problem aims at optimally scheduling communications when moving from an initial data distribution $D_{\mathrm{ini}}$ to a target distribution $D_{\mathrm{tar}}$, where each processor $P_{i}$ will host a subset $P(i)$ of data items. However, modern computing platforms are equipped with a powerful interconnection switch, and the cost of a given communication is (almost) independent of the location of its sender and receiver. This leads to generalizing the redistribution problem as follows: find the optimal permutation $\sigma$ of processors such that $P_{i}$ will host the set $P(\sigma(i))$, and for which the cost of the redistribution is minimal. This report studies the complexity of this generalized problem. We provide optimal algorithms and evaluate their gain over classical redistribution through simulations. We also show the NP-hardness of the problem of finding the optimal data partition and processor permutation (defined by new subsets $P(\sigma(i))$) that minimize the cost of redistribution followed by a simple computation kernel.
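To make the generalized problem concrete, here is a minimal Python sketch. It assumes one particular cost model that the abstract does not spell out: the cost of a permutation is the total number of data items that processors must receive. The function names and the example distributions are hypothetical. Under this assumed cost the problem is an assignment problem, so polynomial-time solvers apply; the brute-force search below is only for illustration on tiny inputs.

```python
from itertools import permutations

def redistribution_cost(initial, target, sigma):
    """Total items received when processor i must end up hosting target[sigma[i]].

    Assumed cost model: every item not already on the processor counts as 1.
    """
    return sum(len(target[sigma[i]] - initial[i]) for i in range(len(initial)))

def best_permutation(initial, target):
    """Exhaustively search all processor permutations (O(n!), toy sizes only)."""
    n = len(initial)
    return min(permutations(range(n)),
               key=lambda sigma: redistribution_cost(initial, target, sigma))

# Hypothetical example: the target subsets are the initial ones, rotated.
initial = [{1, 2}, {3, 4}, {5, 6}]
target = [{3, 4}, {5, 6}, {1, 2}]

identity = tuple(range(len(initial)))
sigma = best_permutation(initial, target)
print(redistribution_cost(initial, target, identity))  # classical redistribution
print(sigma, redistribution_cost(initial, target, sigma))
```

In this example the classical redistribution (identity permutation) moves every item, whereas relabeling the processors moves none.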

HAL-ENS-LYON

INRIA a CCSD electronic archive server

Hal-Diderot

Modelling the Holtenau Ship Lock with SPH

Author: Bilotta Giuseppe
Brudy-Zippelius Thomas
Hérault Alexis
Rustico Eugenio
Publication venue
Publication date: 01/01/2014

Experimental and Computational Hydraulic

Hydraulic Engineering Repository

Optimal Checkpointing Period: Time vs. Energy

Author: Aupy Guillaume
Benoit Anne
Dongarra Jack
Hérault Thomas
Robert Yves
Publication venue: HAL CCSD
Publication date: 01/11/2013

This short paper deals with parallel scientific applications using non-blocking and periodic coordinated checkpointing to enforce resilience. We provide a model and detailed formulas for total execution time and consumed energy. We characterize the optimal period for both objectives, and we assess the range of time/energy trade-offs to be made by instantiating the model with a set of realistic scenarios for Exascale systems. We give a particular emphasis to I/O transfers, because the relative cost of communication is expected to dramatically increase, both in terms of latency and consumed energy, for future Exascale platforms.
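As background for the notion of an optimal checkpointing period, the sketch below uses the classic first-order approximation usually attributed to Young, which considers execution time only and blocking checkpoints. It is not the model of the paper above, which also covers non-blocking checkpoints and energy. The function names and the numeric values are assumptions chosen for illustration.

```python
import math

def first_order_waste(period, checkpoint_cost, mtbf):
    """Approximate fraction of time lost: checkpoint overhead plus expected rework.

    First-order model: checkpoint_cost / period is the overhead of saving state,
    period / (2 * mtbf) is the expected work lost to failures.
    """
    return checkpoint_cost / period + period / (2.0 * mtbf)

def young_period(checkpoint_cost, mtbf):
    """Period minimizing first_order_waste: sqrt(2 * C * mu)."""
    return math.sqrt(2.0 * checkpoint_cost * mtbf)

# Hypothetical platform: 10-minute checkpoints, one failure per day on average.
C = 600.0      # checkpoint cost in seconds (assumed)
MU = 86400.0   # mean time between failures in seconds (assumed)

period = young_period(C, MU)
print(f"period = {period:.0f} s, waste = {first_order_waste(period, C, MU):.3f}")
```

Checkpointing more often than this period pays too much overhead, and checkpointing less often loses too much work per failure.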

INRIA a CCSD electronic archive server

Hierarchical QR factorization algorithms for multi-core cluster systems

Author: Jack Dongarra
Julien Langou
Mathias Jacquelin
Mathieu Faverge
Thomas Hérault
Yves Robert
Publication venue
Publication date: 01/01/2013

This paper describes a new QR factorization algorithm which is especially designed for massively parallel platforms combining parallel distributed nodes, where a node is a multi-core processor. These platforms represent the present and the foreseeable future of high-performance computing. Our new QR factorization algorithm falls in the category of the tile algorithms, which naturally enable good data locality for the sequential kernels executed by the cores (high sequential performance), a low number of messages in a parallel distributed setting (small latency term), and fine granularity (high parallelism). Each tile algorithm is uniquely characterized by its sequence of reduction trees. In the context of a cluster of nodes, in order to minimize the number of inter-processor communications (a.k.a. "communication-avoiding"), it is natural to consider hierarchical trees composed of an "inter-node" tree which acts on top of "intra-node" trees. At the intra-node level, we propose a hierarchical tree made of three levels: (0) "TS level" for cache-friendliness, (1) "low-level" for decoupled highly parallel inter-node reductions, (2) "domino level" to efficiently resolve interactions between local reductions and global reductions. Our hierarchical algorithm and its implementation are flexible and modular, and can accommodate several kernel types, different distribution layouts, and a variety of reduction trees at all levels, both inter-node and intra-node. Numerical experiments on a cluster of multi-core nodes (i) confirm that each of the four levels of our hierarchical tree contributes to build up performance and (ii) build insights on how these levels influence performance and interact with each other. Our implementation of the new algorithm with the DAGUE scheduling tool significantly outperforms currently available QR factorization software for all matrix shapes, thereby bringing a new advance in numerical linear algebra for petascale and exascale platforms.

CiteSeerX

Comparing Distributed Termination Detection Algorithms for Task-Based Runtime Systems on HPC platforms

Author: Bosilca George
Bouteiller Aurélien
Dongarra Jack
Hérault Thomas
Le Fèvre Valentin
Robert Yves
Publication venue: Higashi Hiroshima : Dept. of Computer Engineering, Hiroshima University
Publication date: 01/01/2022

This paper revisits distributed termination detection algorithms in the context of High-Performance Computing (HPC) applications. We introduce an efficient variant of the Credit Distribution Algorithm (CDA) and compare it to the original algorithm (HCDA) as well as to its two primary competitors: the Four Counters algorithm (4C) and the Efficient Delay-Optimal Distributed algorithm (EDOD). We analyze the behavior of each algorithm for some simplified task-based kernels and show the superiority of CDA in terms of the number of control messages. We then compare the implementation of these algorithms over a task-based runtime system, PaRSEC, and show the advantages and limitations of each approach on a practical implementation.
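The general idea behind credit-based termination detection can be illustrated with a toy, sequential Python simulation. This is a sketch of the basic credit-recovery principle only, not the CDA variant or the PaRSEC implementation described above, and every name in it is hypothetical. The initial task holds a credit of 1; a task that spawns children splits its credit equally among itself and them; a finished task returns its share to a controller, which declares termination once it has recovered the whole credit.

```python
from collections import deque
from fractions import Fraction

def simulate_credit_recovery(children):
    """Run a task tree and return the credit recovered after each completion.

    children maps a task id to the list of task ids it spawns (assumed input).
    Exact fractions avoid floating-point rounding when credits are split.
    """
    recovered = Fraction(0)
    pending = deque([(0, Fraction(1))])   # task 0 starts with all the credit
    history = []
    while pending:
        task, credit = pending.popleft()
        spawned = children.get(task, [])
        share = credit / (len(spawned) + 1)
        for child in spawned:
            pending.append((child, share))  # credit travels with the new task
        recovered += share                  # the finished task returns its share
        history.append(recovered)
    return history

# Hypothetical task tree: 0 spawns 1 and 2, and 1 spawns 3.
history = simulate_credit_recovery({0: [1, 2], 1: [3]})
print(history)
```

The recovered credit reaches 1 only after the last task completes, which is the condition the controller tests to detect termination.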

INRIA a CCSD electronic archive server

Efficient Parallel Statistical Model Checking of Biochemical Networks

Author: A. Pnueli
A. S. Miner
Adnan Aziz
B. Novak
Christel Baier
D. Donaldson R.
D.O. Morgan
D.T. Gillespie
Davide Prandi
E. B. Wilson
Edmund M Clarke
Edmund M. Clarke
François Fages
H. A. Hansson
H. Kitano
H. Li
H. Younes
J.-P. Katoen
Jaco van de Pol
Jiří Barnat
L. Dematte
Laurence Calzone
Lawrence D. Brown
Lubos Brim
M. Kwiatkowska
M. Scarpa
Michele Forlin
P. Ballarini
Paolo Ballarini
T. Tian
Thomas Hérault
Tommaso Mazza
Walter W. Piegorsch
William J. Stewart
Publication venue: 'Open Publishing Association'
Publication date: 01/01/2009

We consider the problem of verifying stochastic models of biochemical networks against behavioral properties expressed in temporal logic terms. Exact probabilistic verification approaches such as CSL/PCTL model checking are undermined by a huge computational demand, which rules them out for most real case studies. Less demanding approaches, such as statistical model checking, estimate the likelihood that a property is satisfied by sampling executions out of the stochastic model. We propose a methodology for efficiently estimating the likelihood that an LTL property P holds for a stochastic model of a biochemical network. As with other statistical verification techniques, the methodology we propose uses a stochastic simulation algorithm for generating execution samples. However, three key aspects improve its efficiency. First, the sample generation is driven by on-the-fly verification of P, which results in optimal overall simulation time. Second, the confidence interval estimation for the probability that P holds is based on an efficient variant of the Wilson method, which ensures faster convergence. Third, the whole methodology is designed to run in parallel, and a prototype software tool has been implemented that performs the sampling/verification process in parallel over an HPC architecture.
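For readers unfamiliar with the Wilson method mentioned above, the following Python sketch computes the standard Wilson score interval for a proportion estimated from samples. It is the textbook form, not the "efficient variant" of the paper, whose details the abstract does not give. The function name and the sample counts are assumptions for illustration.

```python
import math
from statistics import NormalDist

def wilson_interval(successes, samples, confidence=0.95):
    """Standard Wilson score interval for a binomial proportion."""
    if samples <= 0:
        raise ValueError("samples must be positive")
    # Two-sided normal quantile, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    p_hat = successes / samples
    denom = 1.0 + z * z / samples
    center = (p_hat + z * z / (2.0 * samples)) / denom
    half_width = z * math.sqrt(
        p_hat * (1.0 - p_hat) / samples + z * z / (4.0 * samples * samples)
    ) / denom
    return center - half_width, center + half_width

# Hypothetical run: the property held in 50 of 100 simulated executions.
low, high = wilson_interval(50, 100)
print(f"[{low:.4f}, {high:.4f}]")
```

Unlike the simple normal-approximation interval, the Wilson interval stays within [0, 1] and behaves sensibly when the estimated probability is close to 0 or 1, which matters when only a few samples satisfy the property.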

arXiv.org e-Print Archive

CiteSeerX

Crossref

Directory of Open Access Journals

Nut production in Bertholletia excelsa across a logged forest mosaic: implications for multiple forest use

Author: A Chávez
A Lawrence
A Malmer
AD Johns
AE Duchelle
AJ Plumptre
B Hérault
Betxy Tabita Villarroel Panduro
BW Griscom
C Baraloto
C Benneker
C Cerdan Rojas
C García-Fernández
C Herrero-Jáuregui
C Padoch
C Sabogal
CA Klimas
CA Peres
CA Rockwell
Cara A. Rockwell
CL Staudhammer
CP Van Schaik
D Pearce
D Sheil
DP Dykstra
E Lima
E Ortiz
E Vidal
Edwin Eduardo Jurado Rojas
Eleanor Warren-Thomas
Eriks Arroyo Quispe
FE Putz
GH Shepard Jr
GS Amacher
GS Hartshorn
Harol Fernandez Silva
J Ghazoul
J Nabe-Nielsen
J Quaedvlieg
J Terborgh
JC Licona
JJ Scullion
JM Pires
JMT Haugaasen
Jonatan Frank Valera Tito
José Andrés Hideki Kohagura Arrunátegui
JQ Chambers
JS Denslow
JS Johns
Juan José Yucra Salas
Julia Quaedvlieg
JW Clay
KA Kainer
Kamal Bawa
L Rist
LB Fortini
LHO Wadt
Luis Alberto Meza Vega
M Menton
M Pinard
M Pinedo-Vasquez
M Schulze
M Soriano
M Tobler
M Zenteno
MA Pinard
MAF Ros-Tonen
Manuel R. Guariguata
Mary Menton
MC Cavalcante
MG Fonseca
MR Guariguata
MR Trivedi
MW Tobler
OL Phillips
Olivia Revilla Vera
P Barreto
P Cronkleton
P Shanley
P Shearman
P Sist
PA Zuidema
PB Camargo
PM Fearnside
PS Sujii
R Giudice
RE Cossío-Solano
RJW Brienen
Roger Quenta Hancco
RP Salomão
RR Sears
S Humphries
S Shackleton
S Vieira
SA Mori
SG Perz
SK Pattanayak
SV Coslovsky
T Haugaasen
T Panayotou
TJ Synnott
TP Holmes
TS Fredericksen
V Garrish
VM Viana
WC Cano
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 13/08/2015

Although many examples of multiple-use forest management may be found in tropical smallholder systems, few studies provide empirical support for the integration of selective timber harvesting with non-timber forest product (NTFP) extraction. Brazil nut (Bertholletia excelsa, Lecythidaceae) is one of the world’s most economically important NTFP species, extracted almost entirely from natural forests across the Amazon Basin. An obligate out-crosser, Brazil nut flowers are pollinated by large-bodied bees, a process resulting in a hard round fruit that takes up to 14 months to mature. As many smallholders turn to the financial security provided by timber, Brazil nut fruits are increasingly being harvested in logged forests. We tested the influence of tree and stand-level covariates (distance to nearest cut stump and local logging intensity) on total nut production at the individual tree level in five recently logged Brazil nut concessions covering about 4000 ha of forest in Madre de Dios, Peru. Our field team accompanied Brazil nut harvesters during the traditional harvest period (January-April 2012 and January-April 2013) in order to collect data on fruit production. Three hundred and ninety-nine (approximately 80%) of the 499 trees included in this study were at least 100 m from the nearest cut stump, suggesting that concessionaires avoid logging near adult Brazil nut trees. Yet even for those trees on the edge of logging gaps, distance to nearest cut stump and local logging intensity did not have a statistically significant influence on Brazil nut production at the applied logging intensities (typically 1–2 timber trees removed per ha). In one concession where at least 4 trees per ha were removed, however, the logging intensity covariate resulted in a marginally significant P value (0.09), highlighting a potential risk for a drop in nut production at higher intensities.
While we do not suggest that logging activities should be completely avoided in Brazil nut rich forests, when a buffer zone cannot be observed, low logging intensities should be implemented. The sustainability of this integrated management system will ultimately depend on a complex series of socioeconomic and ecological interactions. Yet we submit that our study provides an important initial step in understanding the compatibility of timber harvesting with a high value NTFP, potentially allowing for diversification of forest use strategies in Amazonian Peru.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

DigitalCommons@Florida International University

CGSpace

University of East Anglia digital repository

Sussex Research Online

Positive biodiversity-productivity relationship predominant in global forests

Author: Alberti G.
Ammer C.
Barrett C.B.
Bozzato F.
Brandl S.
Bruelheide H.
Chen H.Y.H.
Coomes David A.
Crowther Thomas Ward
De-Miguel Sergio
Fischer Markus
Gianelle D.
Glick H.B.
Gourlet-Fleury S.
Hengeveld G.M.
Hérault Bruno
Kim H.S.
Kitahara F.
Lee B.
Lee E.
Lei X.
Liang J.
Lu H.
McGuire A.D.
Nabuurs G.J.
Paquette A.
Parfenova E.I.
Pfautsch S.
Picard N.
Piotto D.
Pretzsch H.
Salas C.
Schall P.
Schelhaas M.J.
Scherer-Lorenzen Michael
Schmid B.
Schulze E.D.
Sonké B.
Sunderland T.
Tavani R.
Tchebakova N.
Valladares Ros Fernando
Vayreda J.
Verbyla D.
Viana H.
Vibrans A.C.
Watson J.V.
Wiser S.
Zhou M.
Zhu J.
Publication venue: 'American Association for the Advancement of Science (AAAS)'
Publication date: 12/09/2017

The biodiversity-productivity relationship (BPR) is foundational to our understanding of the global extinction crisis and its impacts on ecosystem functioning. Understanding BPR is critical for the accurate valuation and effective conservation of biodiversity. Using ground-sourced data from 777,126 permanent plots, spanning 44 countries and most terrestrial biomes, we reveal a globally consistent positive concave-down BPR, showing that continued biodiversity loss would result in an accelerating decline in forest productivity worldwide. The value of biodiversity in maintaining commercial forest productivity alone - US$166 billion to 490 billion per year according to our estimation - is more than twice what it would cost to implement effective global conservation. This highlights the need for a worldwide reassessment of biodiversity values, forest management strategies, and conservation priorities.

Digital.CSIC